Efficient Estimation of the number of neighbours in Probabilistic K Nearest Neighbour Classification

Authors

  • Ji Won Yoon
  • Nial Friel
Abstract

Probabilistic k-nearest neighbour (PKNN) classification has been introduced to improve the performance of the original k-nearest neighbour (KNN) classification algorithm by explicitly modelling uncertainty in the classification of each feature vector. However, an issue common to both KNN and PKNN is selecting the optimal number of neighbours, k. The contribution of this paper is to incorporate the uncertainty in k into the decision making and, in so doing, use Bayesian model averaging to provide improved classification. Indeed, the problem of assessing the uncertainty in k can be viewed as one of statistical model selection, one of the most important technical issues in statistics and machine learning. In this paper, a new functional approximation algorithm is proposed to reconstruct the density of the model (order) without relying on time-consuming Monte Carlo simulations. In addition, this algorithm avoids cross validation by adopting a Bayesian framework. The proposed algorithm yielded very good performance on several real experimental datasets.
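
The sketch below is a minimal illustration of the general idea described in the abstract: average k-NN class predictions over a range of k, with each k weighted by a posterior built from a uniform prior and a leave-one-out pseudo-likelihood. The function names and the pseudo-likelihood weighting are assumptions for the example; it does not reproduce the paper's functional approximation to the posterior over k, and the expensive leave-one-out sweep is exactly the kind of computation that approximation is intended to avoid.

    import numpy as np

    def knn_class_probs(X_train, y_train, x, k, n_classes):
        # Class proportions among the k nearest training points to query x.
        d = np.linalg.norm(X_train - x, axis=1)
        idx = np.argsort(d)[:k]
        counts = np.bincount(y_train[idx], minlength=n_classes)
        return counts / k

    def loo_log_pseudo_likelihood(X, y, k, n_classes, eps=1e-6):
        # Leave-one-out log-probability of the observed labels for a given k.
        total = 0.0
        for i in range(len(X)):
            mask = np.arange(len(X)) != i
            p = knn_class_probs(X[mask], y[mask], X[i], k, n_classes)
            total += np.log(p[y[i]] + eps)
        return total

    def bma_predict(X, y, x, ks, n_classes):
        # Posterior weights over k (uniform prior, pseudo-likelihood), then
        # average the k-NN class probabilities for query x under those weights.
        log_w = np.array([loo_log_pseudo_likelihood(X, y, k, n_classes) for k in ks])
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        probs = sum(wk * knn_class_probs(X, y, x, k, n_classes)
                    for wk, k in zip(w, ks))
        return probs, dict(zip(ks, w))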

Related resources

Efficient model selection for probabilistic K nearest neighbour classification

Probabilistic K-nearest neighbour (PKNN) classification has been introduced to improve the performance of the original K-nearest neighbour (KNN) classification algorithm by explicitly modelling uncertainty in the classification of each feature vector. However, an issue common to both KNN and PKNN is to select the optimal number of neighbours, K. The contribution of this paper is to incorporate t...

An Accuracy-Assured Privacy-Preserving Recommender System for Internet Commerce

Recommender systems, tools for predicting users’ potential preferences from historical data and users’ interests, are of increasing importance in various Internet applications such as online shopping. As a well-known recommendation method, neighbourhood-based collaborative filtering has attracted considerable attention recently. The risk of revealing users’ private information during the ...

An efficient weighted nearest neighbour classifier using vertical data representation

The k-nearest neighbour (KNN) technique is a simple yet effective method for classification. In this paper, we propose an efficient weighted nearest neighbour classification algorithm, called PINE, using vertical data representation. A metric called HOBBit is used as the distance metric. The PINE algorithm applies a Gaussian podium function to assign weights to different neighbours. We compare PIN...

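As an illustration of distance-weighted voting, the sketch below assigns each of the k nearest neighbours a Gaussian weight that decays with its Euclidean distance to the query. The Euclidean metric and the bandwidth parameter are assumptions made for the example; PINE's HOBBit metric and podium weighting scheme are not reproduced here.

    import numpy as np

    def weighted_knn_predict(X_train, y_train, x, k, n_classes, bandwidth=1.0):
        # Distances from the query x to every training point.
        d = np.linalg.norm(X_train - x, axis=1)
        idx = np.argsort(d)[:k]                               # k nearest neighbours
        w = np.exp(-(d[idx] ** 2) / (2.0 * bandwidth ** 2))   # Gaussian weights
        votes = np.zeros(n_classes)
        for j, wj in zip(idx, w):
            votes[y_train[j]] += wj                           # weighted vote per class
        return int(np.argmax(votes))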

Incorporating Farthest Neighbours in Instance Space Classification

The nearest neighbour (NN) classifier is often described as a ‘lazy’ approach, but it is still widely used, particularly in systems that require pattern matching. Many algorithms have been developed based on NN in an attempt to improve classification accuracy and to reduce the time taken, especially in large data sets. This paper proposes a new classification technique based on k-Nearest Neighbour...

k-Nearest Neighbour Classifiers

Perhaps the most straightforward classifier in the arsenal of machine learning techniques is the Nearest Neighbour Classifier: classification is achieved by identifying the nearest neighbours to a query example and using those neighbours to determine the class of the query. This approach to classification is of particular importance today because issues of poor run-time performance are not such...

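For reference, a minimal k-nearest-neighbour classifier of the kind described above can be written in a few lines: find the k training examples closest to the query and take a majority vote over their labels. The toy data in the usage example are hypothetical values chosen for illustration.

    import numpy as np
    from collections import Counter

    def knn_predict(X_train, y_train, x, k=3):
        # Majority vote among the k training points nearest to query x.
        d = np.linalg.norm(X_train - x, axis=1)
        neighbours = np.argsort(d)[:k]
        return Counter(y_train[i] for i in neighbours).most_common(1)[0][0]

    # Usage with toy 2-D data (hypothetical values):
    X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.1], [0.9, 1.0]])
    y = np.array([0, 0, 1, 1])
    print(knn_predict(X, y, np.array([0.95, 1.05]), k=3))   # prints 1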

Journal:
  • CoRR

Volume: abs/1305.1002   Issue: -

Pages: -

Publication date: 2013